by William R. Stanek
At times it seems the whole world has joined the information revolution. Computers are a part of our everyday lives. Globally, most businesses and hundreds of millions of people own or have access to a computer. Most businesses with more than ten employees have several computers. Often, the computers are connected together in a local area network. Networks can boost productivity, make it easier to communicate without having to leave the desk, and improve the way companies do business.
A growing number of companies have large corporate networks. The company's Wide Area Network may connect hundreds or thousands of computers. A recent trend is to integrate the capabilities of the Internet and the World Wide Web into the corporate information infrastructure.
Thousands of savvy business owners and technical people have recognized the potential for tremendous cost and time savings and have set out to merge years, months, or weeks of existing company publications into the distributed structure of the Internet. The goal of moving legacy publications to the company intranet or Web server is to save time and money, which improves the bottom line. After all, in the end, the bottom line is the only thing that matters.
Every time there is a product release, product update, or press release, documents are created and distributed to company personnel and customers. This mountain of paper and electronic documents grows with the company.
Over time, the value of the company's legacy documents becomes increasingly apparent, especially when it is necessary to distribute historical or promotional information to large groups of employees or customers. The question soon becomes "How do we tap into all the valuable resources employees have created over the years?" The answer is to integrate key documents into the company's information structure, but before you can move these important documents to your intranet or Web server, you must find them.
Imagine a file cabinet in your office stuffed with cryptically labeled or unlabeled documents, video tapes, music cassettes, and photos, and you have an accurate picture of how the majority of users and organizations store their data in computer systems. Many important documents are in personal directories on hard drives that no one can access, and secretaries spend a lot of time passing floppy disks about the office. Some key documents that could benefit everyone are stored in a location that only one person knows about.
The key to sorting through this mess of documents is getting organized. A clear structure for your directories and files becomes increasingly important as the company grows. Hundreds of files in a single directory are difficult to track and maintain. Therefore, you should carefully consider how to organize both files and directories.
Part of your plan should include procedures for labeling documents. After all, organized information is only useful if you know what the documents contain and what the documents are used for. You should also ask personnel to identify documents that should be made available online. Ideally, these documents would be identified using a priority system based on the usefulness of the documents to particular departments, employees, the company as a whole, and your customers.
After you decide how information should be stored and labeled, turn your ideas into policies that everyone in the company can understand and follow. Then distribute these data storage policies throughout the organization and ensure that everyone knows this is part of your effort to make the company's legacy publications available online, which will in the end save everyone time and money.
Once your data storage policies are in place, organize your documents so they are easy to find and use. The benefit of this is three-fold. First, you can finally get started with the real work of making the documents available online. Second, when you are finished, you don't have to worry about tracking hundreds or thousands of new files through a maze of file cabinets, desk drawers, and hard drives. Third, you can now easily identify the different file formats that you have to deal with in your conversion effort.
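One way to identify the file formats you have to deal with is to count files by extension once the documents are organized. The following Python sketch is only an illustration under that assumption; the function names are hypothetical and the extension is treated as a rough stand-in for the file format.

```python
from collections import Counter
from pathlib import Path


def list_files(root):
    """Return the names of all regular files found under the directory root."""
    return [str(path) for path in Path(root).rglob("*") if path.is_file()]


def count_file_formats(names):
    """Count file names by lowercase extension ('' when there is none)."""
    return Counter(Path(name).suffix.lower() for name in names)


def report(names):
    """Format one 'extension: count' line per format, most common first."""
    counts = count_file_formats(names)
    return [f"{ext or '(no extension)'}: {n}" for ext, n in counts.most_common()]
```

Calling report(list_files(directory)) on a document directory gives a quick picture of which converters you will need first.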
After you identify the file formats you need to convert, you can look for conversion solutions. In an increasingly Internet-aware world, many commercial word processors and publishing environments can save documents in a format that you can use directly on the Web. This means that sometimes your conversion effort is as simple as opening the document you want to move to the Web in the application in which the document was created and saving the document as an HTML-formatted file.
In Chapter 10, "Publishing with Microsoft Internet Assistants," you learned that you can convert Word, Excel, PowerPoint, and Schedule+ documents directly to HTML. Similar solutions exist for other applications as well. For example, if you use Corel's WordPerfect, you will find that you can convert your documents directly to HTML using Internet Publisher. Internet Publisher is fairly similar to Internet Assistant. Versions of Internet Publisher will soon be available for most applications in Corel's Office Suite.
Though being able to convert your word processor's document format directly into a Web-usable format saves time and money, direct conversion is not always possible. Sometimes you have to convert documents to an intermediate format such as PostScript. Once the legacy file is in the PostScript format, you can integrate it with an Adobe product to create a document in Adobe's Portable Document Format. A growing number of Web-published documents are in PDF.
As discussed earlier in this book, HTML is based on a specific SGML Document Type Definition (DTD). Other DTDs are also available, or waiting to be written, to meet your needs. There are many mature SGML products from which to choose. New and improved products for the Web are announced every month. Common Ground is one example of a recently developed environment that provides a balanced suite of tools to create Web documents from legacy publications.
As detailed in Chapter 9 "An HTML Toolkit: Browsers, Converters, and Editors," tools for converting your documents to Web publications abound. Converters basically enable you to convert your legacy data to another format. Various implementations of converter technology have been applied effectively to create Web documents using software products not originally intended for this purpose.
This is not a new concept. When you send a document to a printer, the document is converted into a format the printer understands. For example, when you write a document using Microsoft Word, Word's file format stores not only content but also formatting commands. When you print the document, a print driver (or Word itself) converts the document's content and formatting commands into a format your printer understands, such as PostScript (PS) or Encapsulated PostScript.
Most word processing programs can also create a print file and store it to disk. The programs and filters associated with this process are similar to the filters used to create Web-formatted documents. Figure 43.1 shows the conversion process.
Figure 43.1 : Document conversion from native format to Web format.
You can apply filter methods to your Web publishing solution in many ways. Theoretically, you can generate a Web document from any formatted document. You can write programs to search for key characteristics and to replace them with a Web formatting tag. You even can parse a flat text file in this way by searching for things like line breaks, headings denoted by Roman numerals, single lines of text, and so on.
The mapping between native document formatting and the target Web format (HTML, SGML, and so on) must be consistent, however. The feasibility of using a custom program diminishes when the original documents have different internal structures.
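The filter approach described above can be sketched in a few lines of Python. This is a minimal illustration, not a production converter. It assumes a hypothetical convention in which a heading is a line such as "IV. Pricing" (a Roman numeral, a period, and a title) and blank lines separate paragraphs. Documents that do not follow the convention consistently will convert poorly, which is exactly the limitation just noted.

```python
import html
import re

# Assumed convention: a heading looks like "IV. Pricing".
HEADING = re.compile(r"^[IVXLC]+\.\s+(.+)$")


def text_to_html(text):
    """Convert flat text to HTML: Roman-numeral headings and plain paragraphs."""
    out = []
    paragraph = []

    def flush():
        # Emit the lines collected so far as one paragraph.
        if paragraph:
            out.append("<P>" + html.escape(" ".join(paragraph)) + "</P>")
            paragraph.clear()

    for line in text.splitlines():
        stripped = line.strip()
        match = HEADING.match(stripped)
        if not stripped:
            flush()                      # a blank line ends the paragraph
        elif match:
            flush()
            out.append("<H2>" + html.escape(match.group(1)) + "</H2>")
        else:
            paragraph.append(stripped)
    flush()
    return "\n".join(out)
```

The html.escape call matters: characters such as & and < in the original text must be replaced with entities, or the result will not be valid HTML.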
Tip
Although developing custom software can be an expensive option, this is not always the case. A myriad of inexpensive options for processing documents exists. Developing a program to convert a consistently formatted set of files to a format like HTML is relatively straightforward. In some cases, scripting languages like sed and awk may fulfill your needs. Other options include using the macro capabilities of most commercially available word processors to insert Web tags into your documents.
Microsoft Word's use of document templates (DOT files) lends itself well to filter and conversion routines. By following a well-defined template when creating a Word document, thus maintaining a consistent structure, you can easily convert the Word formatting codes to Web format codes. Methods of converting a Word document to Web format include using macros to replace and insert Web formatted tags before saving a document or running a conversion program on the Word file after saving it.
In either case, this type of conversion involves post-processing, meaning that you must check the document for accuracy after using the converter. Standard Web-formatting errors are detected only after you edit your document. Having to recheck and possibly reprocess your documents after using a converter is characteristic of all filter-based document technologies.
One way to recheck documents is simply to display them in your Web browser and ensure that everything is displayed as it should be. You can also use a program to check the documents for accuracy, such as Quarterdeck's WebAuthor. Figure 43.2 shows the first window that appears in WebAuthor when you launch it from Word. As you can see, WebAuthor has a specific setting for opening HTML documents created in Microsoft Word.
Figure 43.2 : Quarterdeck WebAuthor initial window.
If you select Open an HTML authored Word document, and then select the filename of your document, WebAuthor opens the file. As it opens the file, it checks the consistency of the document to ensure it is properly formatted HTML. While WebAuthor will not catch all errors, it will catch most of the conversion errors. Figure 43.3 depicts the type of errors WebAuthor reports if it finds poorly formatted HTML code.
Figure 43.3 : Quarterdeck WebAuthor parse error window.
What WebAuthor has actually flagged as "bad syntax" is the #char notation. Is WebAuthor wrong? Well, the error as reported is somewhat misleading but is still useful if you take into consideration that the error was found while parsing for HTML tags. WebAuthor interpreted the special character # within <A HREF...</A> as non-compliant. What you need to do is move forward with your work by selecting one of the three options Quarterdeck provides for you: remove the tag, edit it and reparse, or exit. Ignoring the "error" is not an option in this case because of the validation criteria built into WebAuthor.
The validation of Web documents after conversion is an essential post-processing activity. This fact can significantly affect costs in converting a large number of documents, even when small discrepancies are found after a document has been converted.
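When many documents are involved, part of this validation can be automated. The Python sketch below is an assumption-laden illustration, not a description of how WebAuthor works. It only checks that tags are opened and closed in matching order, and it treats every element outside a short list as requiring an explicit closing tag, which is stricter than HTML itself requires (for example, a P element may legally be left open).

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a partial list, enough for a sketch).
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class TagBalanceChecker(HTMLParser):
    """Record closing tags that do not match the most recently opened tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check_html(markup):
    """Return a list of problem descriptions; an empty list means balanced."""
    checker = TagBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]
```

A check like this does not replace viewing the page in a browser, but it can flag converted files that need a second look.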
Beyond WebAuthor, you will find dozens of converters that will handle the output of almost every word processing environment. Products like Interleaf, FrameMaker, and other top-end publishing environments offer either HTML filter and conversion programs or access to third-party providers of such programs.
Two notable applications that make conversion from word processor formats to Web-usable formats easier are the Adobe Acrobat and Common Ground product suites.
The Adobe Acrobat suite of products includes a number of programs for creating and displaying portable documents in Adobe's PDF. Two key programs in this product suite are Adobe Exchange and Adobe Distiller. Adobe Exchange and Adobe Distiller were originally developed to convert PostScript files to electronic hypertext documents; they are now being integrated into Web publishing environments with hypertext PDF files as the target end product. Their ability to more closely approximate a magazine format than HTML or SGML is a strong point that has attracted many mainstream publishers. An example of the Adobe Acrobat Reader format appears in Figure 43.4.
Figure 43.4 : Adobe PDF file displayed in Acrobat Reader.
Common Ground is similar in concept to Adobe Acrobat. Common Ground converts your documents to a Digital Paper (DP) format that you can view with their browser, called MiniViewer, on any platform. You can use any publishing suite you want and keep all the composition capabilities you are used to. The online Common Ground document shown in Figure 43.5 looks exactly as it would if you printed it, but you can also include hypertext links in it. The Common Ground MiniViewer is freely distributed, as is the Adobe Acrobat reader, and is catching on quickly on the Web.
Figure 43.5 : Digital Paper file displayed in Common Ground.
With both Adobe Acrobat and Common Ground product suites, you can use your favorite word processor to create documents and then convert them to their portable document format. For example, if you purchased Adobe Acrobat Pro or Adobe Exchange, you can use the Adobe PDFWriter print utility to convert your word processor documents to PDF documents simply by printing them using PDFWriter. In Microsoft Word, the conversion process is as easy as selecting PDFWriter for your printer when you print a document.
Adobe Acrobat presents magazine-quality documents to users across multiple platforms by using the Portable Document Format. PDF is basically an extension of the Encapsulated PostScript format with the ability to use hypertext linking. All you need to view PDF files is the Adobe Acrobat reader, which is currently available for
DOS
Macintosh
SGI
Sun SPARC and HP UNIX platforms
Windows
Adobe Acrobat is a family of products. While Adobe Acrobat Pro offers the most comprehensive publishing solution, the best value is Adobe Acrobat Exchange. With a retail price of $150, Adobe Acrobat Exchange represents a substantial cost savings over Acrobat Pro, which retails for nearly $600. Most Web publishers will find that Adobe Acrobat Exchange enables them to do everything they would like to do with Adobe's PDF documents. For that reason, this section focuses on Adobe Acrobat Exchange and an extension of Adobe Acrobat Exchange called Adobe Amber.
Adobe Acrobat Exchange enables you to create documents in Adobe's PDF. Using Exchange, you can view, print, annotate, build navigational links into, and add security controls to PDF files. Adobe Acrobat Exchange includes an extremely useful print utility called Adobe PDFWriter. In your word processor, you can use PDFWriter as your printer of choice, which will create a PDF document. Another reason to purchase Adobe Acrobat Exchange is that you can increase the functionality of applications using free plug-ins available at Adobe's Web site:
http://www.adobe.com/acrobat/plugins.html
Here's a list of some of the free plug-ins:
Movie enables Acrobat Exchange users to add QuickTime and AVI video files to PDF documents.
WebLink enables Acrobat Exchange users to add World Wide Web (URL) links to PDF documents and follow those links to PDF or HTML files anywhere on the Internet.
SuperCrop adds to the toolbar a new crop tool that looks and works like the Adobe Photoshop crop tool.
SuperPrefs adds new preferences to Acrobat Exchange, such as
AutoIndex enables users to set up auto index features.
OLE Server enables Acrobat Exchange to act as an OLE server to view PDF documents embedded in other OLE-capable applications.
Monitor Setup gives users better control over color.
Adobe Amber is intended to be an update for owners of Adobe Acrobat Exchange and the free Adobe Acrobat reader. In reality, Amber is an Internet-friendly extension of the reader capabilities of Adobe Acrobat products. Although Amber uses some of the Adobe Exchange plug-ins just described, it also has unique features that make it very Internet-friendly. If you have ever become impatient trying to download a large PDF document and wished there were some way you could preview the document as it was downloading, Amber is what you have been looking for.
By combining the built-in features of Amber with an Amber-friendly Web server, Web users finally can view PDF documents one page at a time as the document is downloaded. The capability that makes Web servers "Amber-friendly" is byteserving, the capability to serve just the requested range of bytes from a document rather than the whole file at once. The byteserving capability has already been integrated into the Netscape and Open Market server products. Other servers can become Amber-friendly using a CGI script, which Adobe plans to freely distribute.
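To make the idea of byteserving concrete, here is a minimal Python sketch of the server side. It is not Adobe's CGI script. It assumes the request syntax that HTTP/1.1 later standardized for byte ranges, a header value of the form bytes=start-end, and it handles only a single range. The function names are hypothetical.

```python
def parse_byte_range(header, size):
    """Return inclusive (start, end) offsets for a 'bytes=start-end' value.

    Returns None when the value is malformed, lists several ranges,
    or cannot be satisfied for a document of the given size.
    """
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None
    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None
    try:
        if first == "":                  # suffix form: the last N bytes
            length = int(last)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)             # never read past the end of the file
    if start < 0 or start > end:
        return None
    return start, end


def serve_range(data, header):
    """Return the requested slice of data, or None if the range is unusable."""
    byte_range = parse_byte_range(header, len(data))
    if byte_range is None:
        return None
    start, end = byte_range
    return data[start:end + 1]
```

A viewer can use requests like these to fetch the bytes for one page first and retrieve the rest of the document afterward.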
More good news about Amber is its capability to use URLs within PDF documents. Amber can also be used as a plug-in for Netscape Navigator 2.0. Currently, Amber is available for free and you can obtain it from
http://www.adobe.com/Amber/
Common Ground is a family of products for creating portable documents. Common Ground presents magazine-quality documents to users across multiple platforms using the Digital Paper format. All you need to view DP files is the Common Ground ProViewer or the MiniViewer. The free MiniViewer is currently available for Windows and Macintosh systems. Viewers are also being developed for UNIX platforms.
You can print Common Ground documents in hard copy form from any system, and they will look exactly as the author intended. This feature represents a significant advantage over many HTML-based browsers that are still working on printing strategies. Common Ground documents can be distributed as executable (EXE) files on diskettes, CDs, networks, and file servers, or by modems or electronic mail. This capability means that many publishers can publish both hard copy and soft copy documents in essentially the same step. It also means that Common Ground documents can be distributed to electronic audiences who are not Web-enabled.
Common Ground works by taking a document created with any Windows or Macintosh application and converting it into a Digital Paper file. This is done through the application's print capability, with a Common Ground utility called Maker serving as the printer driver. Maker prints your document to a specified disk drive, directory, and file name. The result is a Common Ground document that can be viewed and printed by the Common Ground ProViewer or MiniViewer.
If needed, Maker can combine the document and a MiniViewer into a single executable file that users can view and print even if they do not have Common Ground. The catch is that if you create an executable file on a Windows system, only Windows users will be able to use the file. Similarly, if you create an executable file on a Macintosh system, only Macintosh users will be able to use the file.
Invoke Maker from an application by selecting the Maker printer driver and executing the Print command. Figure 43.6 shows Maker's pop-up Print window.
Figure 43.6 : Common Ground Maker window that appears after you select Print.
Maker has the following features:
FrameMaker is a huge success as the document system of choice of thousands of corporations, educational institutions, and scientific organizations, especially in UNIX environments. This is largely because FrameMaker is one of the most advanced document systems available. It is also one of the most expensive document system options; FrameMaker has a retail price of between $900 and $1,600, depending on the type of system you are using and the number of licenses purchased. The expense of using FrameMaker as your publishing solution for a distributed networked environment seems astronomical compared to alternatives, primarily because with FrameMaker there is no equivalent reader program, and if there were one, you can be almost certain it would not be free.
Without a reader program, FrameMaker is not a portable document solution. If you want to convert your FrameMaker-formatted documents to a Web-usable format, however, solutions are available. Several FrameMaker-to-HTML conversion products have come on and off the market. Many of the products were freeware or shareware products that have since faded from the Internet or are in limited use.
An exception is a commercial-grade conversion suite, originally developed by CERN and subsequently supported by Harlequin, named WebMaker 2.2. Harlequin's WebMaker 2.2 is an inexpensive Web publishing solution for creating full-featured Web pages from existing FrameMaker documents. It is available for Windows 3.x, UNIX, and Macintosh.
CERN developed WebMaker to convert scientific publications written in FrameMaker into HTML format. After WebMaker became available to the public in mid-1994, market demand prompted CERN to select Harlequin to continue developing, supplying, and supporting WebMaker.
With WebMaker, you map FrameMaker tags to HTML tags through a graphical user interface. Error checking alerts you to unmapped tags. WebMaker offers very precise control over FrameMaker-to-HTML mapping. In the reviewed version of WebMaker, there are 169 rules pertaining to the formatting of paragraphs. Rules for paragraphs encompass all primary text elements in your documents, including headings, footnotes, and tables.
A sample of a FrameMaker-to-HTML mapping sequence using WebMaker appears in Figures 43.7 and 43.8. Figure 43.7 shows how WebMaker flags undefined rules. Figure 43.8 shows the dialog box you would use to map rules.
Figure 43.7 : WebMaker flags undefined rules as errors.
Figure 43.8 : Defining a mapping rule.
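The idea behind rule-based mapping, including the flagging of unmapped tags, can be illustrated with a short Python sketch. This is not WebMaker's implementation or rule syntax. The paragraph tag names (Title, Heading1, Body, Footnote) and the function name are hypothetical, chosen only to show the technique.

```python
import html

# Hypothetical FrameMaker paragraph tags mapped to HTML elements.
RULES = {"Title": "H1", "Heading1": "H2", "Body": "P", "Footnote": "SMALL"}


def convert_paragraphs(paragraphs, rules):
    """Convert (tag, text) pairs to HTML lines using the mapping rules.

    Returns (html_lines, unmapped_tags). Paragraphs whose tag has no rule
    are skipped and their tag is reported once, in order of first appearance.
    """
    lines = []
    unmapped = []
    for tag, text in paragraphs:
        element = rules.get(tag)
        if element is None:
            if tag not in unmapped:
                unmapped.append(tag)     # flag the missing rule
            continue
        lines.append(f"<{element}>{html.escape(text)}</{element}>")
    return lines, unmapped
```

Reporting unmapped tags instead of guessing at a default keeps the mapping consistent: you add a rule for each flagged tag and then rerun the conversion.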
Download WebMaker 2.2 for evaluation or purchase from the WebMaker pages at Harlequin's Web site:
http://www.harlequin.com/webmaker/
Using WebMaker, you can do the following:
TeX is another document system that enjoys widespread use in corporations, educational institutions, and scientific organizations. TeX's most highly regarded feature is its support of advanced mathematical equations. TeX has a highly structured language syntax, which forms the basis of many popular TeX derivatives such as LaTeX. Publishers with TeX conversion requirements should review both TeX-to-LaTeX conversion issues and LaTeX-to-HTML conversion issues before committing to a solution.
LaTeX2HTML is a conversion tool that makes it possible for you to convert documents written in LaTeX (or converted from TeX to LaTeX) into HTML. LaTeX2HTML recreates the basic structure of a paper document as a set of interconnected hypertext nodes that can be explored using automatically generated navigation panels. Any defined annotations, such as cross-references, citations, or footnotes, are converted into hypertext links. Special formatting information, such as special font character mappings for mathematical equations, is converted into GIF images that are placed automatically in the hypertext document.
Note
You must use embedded GIF images for converted mathematical equations to present accurately the author's intended notations to users. Remember that HTML only parses data, not formatting information.
LaTeX2HTML is being widely used for preparing electronic books, documentation, scientific papers, lecture notes, training and coursework material, literate programming tools, bibliographic references, and much more. LaTeX2HTML will run on most UNIX systems, but it requires at least Perl version 4. If you've upgraded to Perl 5, there is a version of LaTeX2HTML compatible with Perl version 5.
You can get LaTeX2HTML via FTP from
ftp://ftp.tex.ac.uk/pub/archive/support/
Databases are at the heart of most information systems, and your Web-based information system should be no different. Databases come in many forms.
Often databases are in the form of flat files that are updated by hand. A list of addresses or business contacts in a text document is an example of a flat file. If you maintain flat databases, chances are good that your system administrator has created a program that can extract information from the flat file, such as a person's home address or phone number. Although such databases are simple, they are useful and practical, especially in UNIX environments.
Because flat-file databases are usually formatted as standard ASCII text, using them on your Web or intranet server is easy. All you have to do is make the file available by moving it to an appropriate directory on the server. Then, using a standard Web browser, anyone can display the file as a text document.
If you have seen text files displayed in a browser, you know that the formatting is not especially appealing. Therefore, you may want to convert the text file to HTML using a converter such as ASCII2HTML or RTF2HTML.
To maintain the usability of the flat file database, you may also want to create a search form users can submit to search the database for entries. You will want to parse the output of the form using a CGI script, and then pass the result to a script that searches the database. Fortunately, you probably have existing search scripts that you can use for this purpose immediately, especially if you use UNIX.
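If you do not have a search script on hand, the core of one is small. The Python sketch below is an illustration under stated assumptions: the form submits a single field named q (a hypothetical name), the flat file holds one entry per line, and a match is a case-insensitive substring. In a CGI script, the query string would typically come from the QUERY_STRING environment variable.

```python
from urllib.parse import parse_qs


def search_flat_file(lines, term):
    """Return the entries that contain term, ignoring case."""
    term = term.lower()
    if not term:
        return []                        # an empty search matches nothing
    return [line.rstrip("\n") for line in lines if term in line.lower()]


def handle_query(query_string, lines):
    """Decode the submitted form data and search the flat-file entries."""
    fields = parse_qs(query_string)      # handles + and %xx decoding
    term = fields.get("q", [""])[0].strip()
    return search_flat_file(lines, term)
```

The lines argument can be any iterable of strings, including an open file object, so the same function works on the flat file directly.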
More complex company databases are probably maintained on commercial database systems. The problem with commercial database systems is that they use proprietary formats and interfaces, which makes it extremely difficult to convert your data. The good news is that major database vendors, like Oracle and Sybase, have pulled out all the stops to ensure that their database systems can take advantage of Internet technologies.
Oracle developed a product called WebSystem that consists of Web server and client products that easily integrate with your existing Oracle databases. The key component of WebSystem is Oracle WebServer. The Oracle WebServer provides a complete Web publishing and database solution that combines the power and reliability of Oracle7 Enterprise or Universal Server with the capabilities of the Web.
Sybase offers several solutions for moving your databases to the Web. Their showcase solution integrates Sybase with Silicon Graphics servers to create a powerful Web-ready database solution called WebFORCE CHALLENGE. WebFORCE CHALLENGE features specific modules for conducting online inventory management, transaction management, customer tracking, and marketing using Sybase.
Once you obtain an Internet solution from your database vendor, you will be able to directly access your database on your intranet or Web server. Direct access to your database ensures that the data remains usable and functional on the Web.
Spreadsheets are another valuable type of legacy publication that you may want to make accessible online. Before you move a spreadsheet to the Web, you should ask whether users will want to view or edit the data in a particular spreadsheet. The answer to this question is extremely important.
Creating viewable files out of existing spreadsheet data is easy. One solution is to open the spreadsheet in the application in which it was created, such as Excel or Lotus 1-2-3, and save the spreadsheet as a standard text file with the .txt extension. To do this, select Save As from the application's File menu. Next, as shown in Figure 43.9, select the file type. Keep in mind that any formulas embedded in the spreadsheet are generally not saved in the text file. After you move the file to a directory on the server, anyone will be able to view the spreadsheet data as standard text in a Web browser.
Figure 43.9 : Saving a spreadsheet as text in Microsoft Excel.
If you think about it, the columns and rows of a spreadsheet are really the same as the columns and rows of a table, which makes converting a spreadsheet to HTML a snap. Your spreadsheet displayed as an HTML table will be much more appealing than a standard text file. Using the text file you just created, you could add the appropriate HTML markup to create a file containing an HTML table.
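Adding the table markup is mechanical enough to script. The following Python sketch assumes the spreadsheet was saved as tab-delimited text and that the first row holds the column headings; both are assumptions about your export, and the function name is hypothetical.

```python
import csv
import html
import io


def text_to_table(text, delimiter="\t"):
    """Convert delimited spreadsheet text to an HTML table.

    The first non-empty row becomes header cells (TH); the rest become TD.
    """
    out = ["<TABLE BORDER=1>"]
    header_done = False
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue                     # skip blank lines in the export
        cell = "TD" if header_done else "TH"
        header_done = True
        cells = "".join(f"<{cell}>{html.escape(value)}</{cell}>" for value in row)
        out.append(f"<TR>{cells}</TR>")
    out.append("</TABLE>")
    return "\n".join(out)
```

Because the text file holds only values, the resulting table shows the results of any formulas as they stood when the file was saved.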
Although you could add markup to a basic spreadsheet in a matter of minutes, you probably do not want to convert lengthy spreadsheets by hand. Fortunately, an increasing number of spreadsheet applications have add-ons that support direct conversion of the spreadsheet to HTML: Microsoft has Internet Assistant for Excel, and Corel has Internet Publisher for Quattro Pro. Using these conversion tools, you can convert dozens of existing spreadsheets to a Web-viewable format in minutes.
Like database files, spreadsheet data usually needs to be presented on the Web in a more dynamic format than a static page. After all, what good is a spreadsheet if you cannot manipulate it when you need to?
Displaying a spreadsheet in an editable format requires a little more work than converting the spreadsheet to a viewable format, and a lot more ingenuity. To display the spreadsheet in an editable format, you will use the spreadsheet application itself to display the spreadsheet.
The first step is to create HTML pages that reference your spreadsheets. This requires that you reference the spreadsheet by name, such as in the following code:
<H2>Spreadsheets</H2>
<P><A HREF="1q96_earnings.xls">First Quarter '96 Earnings</A></P>
<P><A HREF="2q96_earnings.xls">Second Quarter '96 Earnings</A></P>
<P><A HREF="3q96_earnings.xls">Third Quarter '96 Earnings</A></P>
<P><A HREF="4q96_earnings.xls">Fourth Quarter '96 Earnings</A></P>
<P><A HREF="ytd97_earnings.xls">Year-To-Date '97 Earnings</A></P>
Launching the spreadsheet application when the user clicks a hypertext link to a spreadsheet data file requires updates to configuration files on your server and client browsers. These updates ensure that your server and browser correctly identify spreadsheet data files and launch the appropriate application to display the files.
Because spreadsheet data files should end in unique extensions, you need to configure your server to send files with these extensions as new MIME types. On most servers, MIME types are stored in a specific configuration file called mime.types or mime.typ. You need to edit this file and add entries for each spreadsheet application used on your network.
If you use Microsoft Excel, you will add this MIME type entry to the end of the mime.types file:
application/msexcel xls xcl
If you use Lotus 1-2-3, you will add this MIME type entry to the end of the mime.types file:
application/lotus wks wk4
Note
The exact number of spaces between the MIME type and the extension designator does not matter. MIME types are broken down into basic categories, such as application. The application type identifies binary data that can be executed or used with another application. Each data type category has a subtype associated with it. MIME subtypes are defined as primary data types, additionally defined data types, and extended data types. The primary subtype is the primary type of data adopted for use as MIME content types. Additionally defined data types are additional subtypes that have been officially adopted as MIME content types. Extended data types are experimental subtypes that have not been officially adopted as MIME content types.
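To see how a server might use entries like the ones above, consider this Python sketch. It is an illustration only, not any particular server's code. It assumes each line holds a MIME type followed by whitespace-separated extensions, and that a # starts a comment; the function names and the application/octet-stream fallback are choices made for the example.

```python
def parse_mime_types(text):
    """Map each file extension to its MIME type from mime.types-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        mime_type, *extensions = line.split()  # any amount of spacing works
        for extension in extensions:
            mapping[extension.lower()] = mime_type
    return mapping


def content_type_for(filename, mapping, default="application/octet-stream"):
    """Pick the content type to send for a file, based on its extension."""
    _, dot, extension = filename.rpartition(".")
    if not dot:
        return default                   # no extension at all
    return mapping.get(extension.lower(), default)
```

With the two entries shown earlier loaded, a request for 1q96_earnings.xls is answered with the application/msexcel type, which is the information the browser needs to choose a helper application.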
After you save the MIME configuration file, you should restart your server. You are now ready to configure your browser to launch the spreadsheet application as a helper application. Do this by setting preferences from an options menu within the browser. In Netscape Navigator, when you select General Preferences under the Options menu, you can use the dialog box shown in Figure 43.10 to configure helper applications.
To set a helper application in Netscape Navigator 3, click the Create New Type button. This action opens the Configure New MIME Type dialog box in which you can enter the MIME type and subtype. In the MIME Type field, enter the keyword application. In the MIME SubType field, enter the MIME subtype, such as msexcel or lotus. Once you enter this information, click the OK button to close the Configure New MIME Type dialog box.
In the File Extension field of the dialog box shown in Figure 43.10, enter the file extensions for the spreadsheet application in a space-separated list, such as wks wk4. In the Action field, select the radio button labeled Launch the Application. Next, click the Action field's Browse button to open the Select an Appropriate Viewer dialog box. Using this dialog box, search through your file system until you identify the file path to the binary executable for the spreadsheet application. When you are finished, click the OK button. After you double-check the accuracy of your helper application entry, click the OK button of the Preferences dialog box.
That's it! Now your server and browser are properly configured to launch the spreadsheet application when you need it. Keep in mind that if multiple copies of browsers are separately installed on the network, you will need to configure those browsers as well.
Making legacy publications available on your intranet or Web server gives employees and customers instant access to the information they need to make decisions. Before you start moving files, you should develop a clear plan that begins with getting the company's data organized. After your plan is developed and distributed to company personnel, you can start moving legacy publications to the Web. Usually you will want to move files on a prioritized basis that ensures that the most useful legacy data is moved first.
In general, your legacy data will be in one of three formats: document, database, or spreadsheet. Although many document formats are easily converted to HTML, converting database and spreadsheet files often requires some forethought. Thankfully, in an increasingly Internet-aware world, most software vendors are developing tools that make it easier to move legacy publications to the Web. If a specific tool is not available to display the document in a usable format, you can, as you saw with spreadsheets, launch applications directly by configuring your server and browser to recognize and properly handle additional MIME types.