by John Jung
People, and companies, often create Web pages that target a single topic. Web pages aimed at fans of a particular star, or show, are widely accessible on the Web. Companies create Web pages to promote themselves and possibly help their customers. Often it is useful for the visitors of these sites to be able to talk together. This chapter explains how you, as a Web publisher, can make such discussions possible.
Probably the first question that most readers have is, "Why do I want to use a Web-based discussion group?" The answer is quite simple: To allow many people to talk to each other about a common topic. Web browsing is one of the fastest growing activities on the Net. Traditional activities, such as FTP, e-mail, and Usenet, are still being accessed, but they are not growing as fast as the Web. Consequently, there is a need for Web-based discussion groups. Two main groups of Web authors can make use of a discussion group: the fans of a celebrity, or show, and organizations, such as companies.
Although many people enjoy gawking at celebrities or chatting about television shows, some people are more enthusiastic than others. It's these people who would most likely create a Web page just for their favorite celebrity or show. Many such Web pages are very straightforward, offering any information the Web author has about the subject. Extremely popular celebrities or television shows may have large numbers of such fans who want to talk about their fondness for a particular subject.
Although Usenet newsgroups can be created for the purpose of discussing such subjects, not everybody has access to such groups. The reason these groups don't have that much exposure on Usenet is because of their transient nature. Once a show is cancelled or a star loses his or her appeal, such newsgroups typically fade away. Many Usenet administrators won't carry such newsgroups precisely for these reasons.
A good alternative to this type of Usenet newsgroup is a Web-based discussion group. In such a board, fans can find, or give, information on the celebrity or show. Whether it's the latest gossip or a report on some related news article, such information can be incorporated into the Web page itself. This result brings a certain sense of pride to the participants of a Web-based discussion group. The only problem with a Web-based discussion group is that it doesn't get the exposure that other discussion forums do. A Usenet newsgroup can reach millions of readers; a Web discussion group can only reach whoever accesses the page. However, a Usenet newsgroup won't necessarily be carried by all sites at all times. A Web-based discussion group is carried by the Web author for as long as he wants. If he's particularly enthusiastic about his Web page, it can stay up forever.
Every corporation wants to know what its customers are thinking. After all, if it knows what the people want, it can try to provide it for them. A Web discussion group can be a good way to get this information. Corporate-based discussion groups tend to focus on the corporation where the Web site is located. Chances are, if the Web site is for Ed's CompuHut, the discussion group will focus on the store. Customers can post their opinions about a particular product or salesman, questions about using software, or compliments. Similarly, the employees at the corporation can help customers with problems, respond to complaints, and provide late-breaking news about the company. This type of discussion group provides one of the most open channels of communication for a company. It allows them to be insulted and praised on their own systems.
Usenet is a collection of newsgroups, each of which contains news articles. These general groups are broken up into more specific groups where the content is defined and regulated by its participants. There are a number of problems with using a Usenet-style newsgroup for Web pages. First and foremost is that you need special user privileges to create and maintain each group. For fan-based Web pages, this approach is generally unworkable.
Another problem with the Usenet model is that the specific newsgroup is accessible by a limited number of computers. When an article is posted to a newsgroup, it's handed off to other computers that are capable of recognizing that newsgroup. The problem with narrowly-focused newsgroups, such as for a company, is that most computers outside the company won't carry it. It's not that the noncompany computers necessarily have a problem with the content; it's just that there's little or no demand for those groups. Consequently, creating a Usenet-style discussion group for a Web page is a tremendous amount of work. Aside from the privileges problem, there is the problem of convincing many systems administrators to carry a single Usenet-style group.
Because of these two problems, it is extremely difficult to use Usenet-style newsgroups for a Web-based discussion group. The best method of allowing people to talk about a particular topic is to use a Web-based discussion group. Other options, such as mailing lists or restricted newsgroups, are basically derivatives of a Web-based discussion group.
Before creating a discussion group, you have to decide who should, and shouldn't, be able to post to the board. This choice isn't as obvious as it may seem; certain conditions must be weighed. On the one hand, you can have unrestricted access, which means anybody can post to it. This is similar to most Usenet newsgroups, where anybody can subscribe to a group, and post to it. With restricted access, only authorized people can post information. This approach is much like private chat areas, where only invited people can participate. Related to deciding who can post is the decision as to whether the discussion group should be moderated. Moderated discussion groups are sort of a middle ground between completely open access and completely restricted access.
One of the best aspects of Usenet is its complete openness. Because anybody can post anything he or she wants, all opinions are heard equally. Consequently, if a point is argued or defended well, the author will probably command respect. This atmosphere makes it possible for minority opinions to be exposed to everybody at large. In this respect, unrestricted access is great for allowing everyone a complete perspective on any given topic. Another benefit of an open-ended discussion group is the many different viewpoints from other participants. If you have a problem with something and you live far from civilization, an unrestricted discussion group can bring you a great deal of help. You don't necessarily need to drive many miles just to get an opinion; you can simply post your query to the board. Whoever reads the board may respond with some useful answers.
The obvious downside of an open-ended discussion group, such as Usenet, is the easy abuse of it. People can easily post topics that are unrelated to your board. Anybody who's read Usenet for very long knows how annoying chain letters and advertisements can be. Although they are less likely to appear, these types of messages are also possible on Web-based discussion groups. Another downside of a discussion group such as Usenet is that everybody can post. Without someone to regulate what's posted, there is little accountability. Anybody can write to your board and disrupt the ongoing discussions. Unrestricted access is a double-edged sword; on the one hand, you let in many viewpoints. On the other hand, you have to deal with possible abuse from the throngs of potential posters.
Restricted access to a discussion group usually takes the form of a password-protected Web page. Only people whose accounts let them past the password prompt can post to the board. This type of discussion group is best suited for service-oriented companies. Customers of the company, as well as employees, can be given accounts on the protected Web page. As a result, both the vendor and customer can privately discuss issues in peace. Additionally, because the Web page is password-protected, the customer's competition won't be able to read proprietary information.
On the whole, restricted discussion groups tend to be more focused. Depending on the criteria for restricting access to a discussion group, the content can be very helpful. The downside of a restricted discussion group is a lack of more diverse information. Because the discussion group participants have been filtered already, there are fewer voices to be heard. This means that most mainstream opinions are heard, but few extremes are. If you post a query to a restricted discussion group, you may or may not get an answer. If you send that same message to an unrestricted board, you'll almost always get a reply, right or wrong.
Suppose you posed an obscure trivia question about a television show to a restricted discussion group for that show. If nobody with access knows the answer, you won't get any help. However, if you post that question to an unrestricted discussion group, your chances will improve. Another downside to a restricted discussion group is the problem of stagnation. After a while, all the old-timers tend to dominate the topics. Because the discussion has fewer participants, there tend to be fewer new topics. Most of the old-timers have already expressed their opinions to each other, and few want to explain themselves again. Open discussion groups tend to have a constantly changing mix of new participants.
The middle ground between unrestricted access and completely restricted access is a moderated board. A moderated discussion group is one where anybody can participate in the discussion group. However, when someone posts something, it's not posted directly to the discussion group. Rather the post is sent to a moderator. This person is the only one authorized to post anything to the discussion group. Although this person doesn't write the posts, he or she is the filter for the content of the board.
The primary advantage of having a moderated board is that you have the best of both worlds. You have access to a wide diversity of facts and opinions. At the same time, undesirables are kept from polluting your discussion group. Topics of discussions are relevant, and the participants won't (publicly) insult each other. Best of all, there are no disruptions from unrelated posts.
The big drawback of a moderated discussion group is that you need a moderator. Individuals who run discussion groups don't want to have to sift through and determine what is, and isn't, an acceptable post. For these individual discussion groups, a moderated discussion group isn't an available option. Large organizations, however, might find this approach worthwhile. It allows them to keep their discussion group focused on whatever they like. If the organization is a company, only posts concerning the company's products are likely to be approved. Praise for a competitor's product will also be rejected, for obvious reasons.
The majority of Web-based discussion groups have three basic components. The first is a table of contents, which shows some information about the messages themselves. The second component is some sort of mechanism that assigns each message its own unique file name. The individual messages are the final component. Generally, the individual messages are stored in their own directory to isolate them from other Web pages. The following sections describe the components and how they interact with each other.
Every Web-based discussion group has to have some place where the messages are referenced. The messages themselves are not stored in this place, only their URLs are. The table of contents for a Web board is often a Web page itself. Usually the subject, author, and corresponding URL of each article are displayed on separate lines. When someone wants to read a message on a Web-based discussion group, he'll point his browser to the table of contents. If a particular subject interests him, he simply has to click on the appropriate link.
In all likelihood, you don't want to keep all messages for a particular board in one file. After all, if there is a lot of posting, the file will grow exceptionally large. Additionally, if the single file were corrupted in some manner, you would lose all topics of discussion. Consequently, you should keep each post in its own file. Another component in Web-based discussion groups is some mechanism for creating unique file names.
The most direct approach in implementing this capability is to assign each message a number. This number can be saved in a plain text file for easy retrieval. When someone posts a message to the discussion group, you simply have to fetch the current message number. Next, save the newly posted message to a unique file name by using the message number. You can then update the file that holds the message number. On DOS-based machines, where a file name can have at most eight characters before the extension, this naming scheme limits you to a maximum of 99,999,999 messages. If your operating system supports long file names, you can have significantly more messages.
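The counter-file approach might look like the following minimal Python sketch. The function names and the `.htm` extension are assumptions made for illustration, not part of any particular discussion group package, and the sketch does not guard against two posts arriving at the same instant.

```python
from pathlib import Path


def next_message_number(counter_file: Path) -> int:
    """Read the last assigned number, add one, and write it back."""
    try:
        last = int(counter_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        last = 0  # no counter yet (or an unreadable one): start from zero
    number = last + 1
    counter_file.write_text(f"{number}\n")
    return number


def message_file_name(number: int) -> str:
    """Zero-pad to eight digits so the name fits a DOS 8.3 file name."""
    return f"{number:08d}.htm"
```

A posting script would call `next_message_number()` once per post and save the message under the name returned by `message_file_name()`.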
Another approach to this problem is to dynamically assign each file name. To make it unique, you can use the current date and time as part of the file name. Although this method makes the file names for each message a little complicated, it is a viable solution. This approach wouldn't require any extra file to hold the unique file name. However, the downside of this approach is that it is, theoretically, possible for several people to try to save different messages at the exact same time. Such a conflict can result in a corrupt message. Certainly, the probability of this scenario happening is very remote, but it does exist.
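As a rough illustration, the date-and-time naming scheme could be sketched in Python as follows. The function name and the name format are assumptions. The sketch also goes one step beyond the text: it opens the file in exclusive mode (`"x"`), so a name collision causes a retry with a numeric suffix instead of a corrupt message.

```python
import time
from pathlib import Path


def save_with_timestamp_name(directory: Path, text: str) -> Path:
    """Save text under a name built from the current date and time.

    Opening with mode "x" fails if the file already exists, so two
    posts that land on the same name cannot overwrite each other.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    attempt = 0
    while True:
        suffix = f"-{attempt}" if attempt else ""
        path = directory / f"{stamp}{suffix}.html"
        try:
            with open(path, "x", encoding="utf-8") as handle:
                handle.write(text)
            return path
        except FileExistsError:
            attempt += 1  # name taken: try the next suffix
```

Names this long assume an operating system that supports long file names.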
The final component in a Web-based discussion group is the messages themselves. Because you want to keep each message independent of all others, you will need to keep each individual message. These messages are often stored in their own directory, so as to not clutter up a directory full of Web pages. Most of the time, each message is wrapped in some HTML code. The HTML code typically allows the user to go to the next, or previous, message. The capability to post a new message, or a reply to the current one, is also part of the HTML code.
Now that you know what all the parts of a discussion group are, it's time to put them all together. The table of contents is instrumental in making use of the other components. That is, the table of contents usually has links to a cgi-bin script. That script looks at the second component, the last uniquely assigned message number. Finally, the cgi-bin script creates the final component, the message itself, by saving the posted message. Although each component of a discussion group is important, you don't need to write cgi-bin scripts for each of them. It's possible to create a good cgi-bin script that can handle all the components.
Before creating a discussion group, you have to create a table of contents. Although having a table of contents of nothing might seem a little strange, it is necessary. The table of contents should be an HTML file, so that all the contents can be formatted as links to the messages. When new messages are posted, this Web page has to be modified. At the very least, the information for the new post has to be added. The table of contents page should also allow users to post to the discussion group. You can provide this feature most easily by having a separate page to take the user's input.
The page that the table of contents page can point to is the posting page. This Web page has fill-out forms asking for the user's name, e-mail address, subject, and message body. Obviously, you'll also need to point to a cgi-bin script to handle the input. The only real purpose of this page is to not clutter up the other pages of the discussion group. Although you could embed this Web page with other discussion group-related pages, you generally don't want to. The contents of the posting page are rather extensive and usually take up a lot of space. This space could be better spent in storing information for more messages or showing more of the body of the message.
When the user is happy with what he's entered into the posting page, he clicks a button. This button should point to a cgi-bin script file that will process the input. It should check to make sure that none of the fields in the forms was empty. If anything was left out, the script should send an error message to the user. Although it'd be nice for the script to be able to check for valid inputs, such as a valid e-mail address, it's not realistically possible. From the Web, it's very difficult to accurately verify information entered by the user through form fields.
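The empty-field check might be sketched in Python like this. The field names (`name`, `email`, `subject`, `body`) are assumed to match the posting page's form fields, and the form data is assumed to have been parsed into a dictionary already.

```python
REQUIRED_FIELDS = ("name", "email", "subject", "body")


def missing_fields(form: dict) -> list:
    """Return the required fields that are absent or blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(form.get(field, "")).strip()
    ]


def error_page(missing: list) -> str:
    """Build a small HTML error message listing what was left out."""
    items = "".join(f"<li>{field}</li>" for field in missing)
    return (
        "<html><body><p>Please fill in the following fields:</p>"
        f"<ul>{items}</ul></body></html>"
    )
```

If `missing_fields()` returns an empty list, the script can go on to save the post. Otherwise it sends back the page built by `error_page()`.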
After the post has passed through the very simple checking process, the file has to be created. The cgi-bin script file should first read the message number file, and then open a new file, named after that number, for writing. Next, the parsed message has to be printed out to the newly opened file. The form values should be formatted and sent to the file in a reasonable fashion. For example, create some sort of message header that holds the user name, e-mail address, and subject line. The body can be separated from the header by using an HTML horizontal rule. The purpose of nicely formatting the message is to make it easier for users to read.
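One possible way to lay out a saved message is sketched below in Python. The exact markup is an assumption. The call to `html.escape` is also an addition not described in the text; it keeps characters such as `<` and `&` in a user's input from being read as HTML.

```python
from html import escape


def format_message(name: str, email: str, subject: str, body: str) -> str:
    """Wrap a post in HTML: a header, a horizontal rule, then the body."""
    # escape() keeps characters such as < and & in user input from
    # being interpreted as markup.
    return (
        "<html><head><title>{subject}</title></head><body>\n"
        "<p><b>From:</b> {name} &lt;{email}&gt;<br>\n"
        "<b>Subject:</b> {subject}</p>\n"
        "<hr>\n"
        "<pre>{body}</pre>\n"
        "</body></html>\n"
    ).format(
        name=escape(name),
        email=escape(email),
        subject=escape(subject),
        body=escape(body),
    )
```

The returned string is what the script would write to the newly opened message file.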
Finally, the cgi-bin script has to update the table of contents file to provide access to the new message. You can use a text parsing program called by the cgi-bin script to handle this task. One approach would be to have an HTML comment in the table of contents. When a post is being added to the table of contents, the parsing program looks for the comment line and returns its location to the cgi-bin script. Once the line is found, the important information for the new article is put in, and the rest of the table of contents is appended. Using the HTML comment, you can also put new messages at the end of the table of contents.
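The comment-marker technique could be sketched in Python as follows. The marker text, the function name, and the list-item layout of each entry are all assumptions. The `newest_first` flag shows how the same marker can put new entries either at the top or at the end, depending on where the comment sits in the page.

```python
from html import escape

MARKER = "<!-- NEW MESSAGES GO HERE -->"  # assumed marker text


def add_to_contents(contents: str, subject: str, author: str, url: str,
                    newest_first: bool = True) -> str:
    """Insert a link for a new message next to the marker comment."""
    if MARKER not in contents:
        raise ValueError("marker comment not found in table of contents")
    entry = (
        f'<li><a href="{escape(url)}">{escape(subject)}</a>'
        f" - {escape(author)}</li>"
    )
    before, after = contents.split(MARKER, 1)
    if newest_first:
        # Marker sits above the list: new entry goes right after it.
        return f"{before}{MARKER}\n{entry}{after}"
    # Marker sits below the list: new entry goes right before it.
    return f"{before}{entry}\n{MARKER}{after}"
```

The script would read the table of contents into a string, pass it through `add_to_contents()`, and write the result back to the same file.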
Another approach would be to hard code the number of lines that lead up to the beginning of the table of contents. Suppose that your table of contents Web page uses 10 lines to create the general appearance of the discussion topics. After the new message has been saved, you read the table of contents in and pass over its first 10 lines, leaving them unchanged. Next, you add in the information for the new message, and then append the existing information to the end.
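The fixed-line-count variant is even shorter. In this Python sketch, `HEADER_LINES` and the function name are assumptions, and the new entry is assumed to be a single line of HTML.

```python
HEADER_LINES = 10  # assumed number of layout lines before the entries


def insert_after_header(contents: str, entry: str,
                        header_lines: int = HEADER_LINES) -> str:
    """Place entry right after the first header_lines lines."""
    lines = contents.splitlines(keepends=True)
    head, rest = lines[:header_lines], lines[header_lines:]
    return "".join(head) + entry + "\n" + "".join(rest)
```

The trade-off is that any change to the page layout that adds or removes a line before the entries also requires changing the hard-coded count.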
With the basic components laid down, you can create a very simple and basic Web-based discussion group. This discussion group lacks a number of useful items that non-Web discussion groups have, however. An obvious feature that's lacking is the capability to reply to the current message. Another feature that is missing is the capability to manage the messages themselves. You can also add intelligent parsing to the Web-based discussion group. The following sections explain the general approaches to implementing each of these features.
A feature that's very obviously missing from the current outline of a discussion group is the ability to reply to a message. With the current approach, there is no easy method of allowing replies. The best way to enable replies is to embed posting abilities within each message. At the bottom of each article, you could have fill-out forms with the original message. When readers want to reply to the message, they edit the fill-out forms, and then click a button. Make the fill-out form field variable names the same as those on the posting page.
With this approach, one cgi-bin script could handle both new postings and replies. When someone wants to reply to an article, the form field values are already entered, and the writer just has to modify them. Though this approach will clutter up the individual messages, it does provide an easy reply mechanism.
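A reply form embedded at the bottom of a message might be generated along these lines. This is only a sketch: the script URL passed as `action` is hypothetical, and the "Re:" subject prefix and ">" quoting are borrowed from Usenet convention, not prescribed by the text. The field names are assumed to be the same ones used on the posting page, so that one script can handle both.

```python
from html import escape


def reply_form(action: str, subject: str, body: str) -> str:
    """Build a form pre-filled with the original subject and quoted body."""
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    quoted = "\n".join("> " + line for line in body.splitlines())
    return (
        f'<form method="post" action="{escape(action)}">\n'
        'Name: <input type="text" name="name"><br>\n'
        'E-mail: <input type="text" name="email"><br>\n'
        f'Subject: <input type="text" name="subject" value="{escape(subject)}"><br>\n'
        f'<textarea name="body" rows="10" cols="60">{escape(quoted)}</textarea><br>\n'
        '<input type="submit" value="Post reply">\n'
        "</form>\n"
    )
```

The script that saves a message would append the output of `reply_form()` to the message's HTML before writing the file.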
Another useful capability that's lacking in the currently outlined Web-based discussion group is any type of article maintenance. Theoretically, if a discussion group were never cleaned out, you could read all the articles ever posted. With many messaging systems, however, each forum has a certain amount of space allocated to it. When someone tries to create a new message, and can't, an old message is deleted to make room. Although adequate in most cases, this system suffers when a discussion area stagnates. In such a situation, the discussion group could contain articles that are many months old simply because no new articles have been posted to get rid of the old ones.
Another approach to managing disk space and discussion groups is the Usenet approach. In this approach, each site mandates a maximum amount of time that any article can be kept on the system. This limit allows for an automatic expiring mechanism that is uniform and consistent. Although this limit means there are no truly old articles, it also means that some discussion groups might not have any articles at all. On Usenet, it's not unusual to see stagnant newsgroups with absolutely no content.
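A time-based expiry pass in the Usenet style could be sketched in Python as follows. The function name, the `.htm` extension, and the use of each file's modification time as the posting date are assumptions.

```python
import time
from pathlib import Path


def expire_old_messages(directory: Path, max_age_days: float, now=None) -> list:
    """Delete message files older than max_age_days; return their names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for path in sorted(directory.glob("*.htm")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

The sketch only removes the message files. The table of contents would still need its links to the expired messages removed.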
The current example of a discussion group has no article management facility. You can add this type of functionality by creating a separate discussion group Web page. Make the page password-protected so that only you, the board maintainer, can access it. This private Web page should also have a separate cgi-bin file to do all the dirty work. The private Web page need not be elaborate; after all, it's intended only for one person. However, it should be functional and easy to use, for your sake. It should have form fields that give you a number of useful options, such as listing articles by an author, by a date, or by subject. It should also have links and form fields for deleting articles.
The private cgi-bin script is perhaps much more important than the Web page itself. This script should be capable of doing a great deal of text processing for the discussion group Web page. The purpose of most of the cgi-bin script will be to give you, the board maintainer, meaningful information. It should return items such as message subject, author, URL, and message number. The rest of the script file will act on that information. It should be capable of deleting files entirely or deleting individual lines from a file. When you want to delete an article, the cgi-bin file should already know which article you want deleted. Next, it should go out and remove the article's file. Finally, it should update all messages that replied to that article. Obviously, this process involves a great deal of talking and passing of information between the private Web page and cgi-bin script. As a result, some Web-based discussion groups put the two, the cgi-bin script and the private Web page, together.
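The deletion step might be sketched in Python like this. The function name is an assumption, and the sketch relies on each table of contents entry sitting on its own line that contains the message's file name. It does not attempt the last step described above, updating the messages that replied to the deleted article.

```python
from pathlib import Path


def delete_article(messages_dir: Path, contents_file: Path, file_name: str) -> bool:
    """Remove a message file and any contents line that links to it."""
    target = messages_dir / file_name
    existed = target.exists()
    if existed:
        target.unlink()
    lines = contents_file.read_text(encoding="utf-8").splitlines(keepends=True)
    # Keep every line of the table of contents except those naming the file.
    kept = [line for line in lines if file_name not in line]
    contents_file.write_text("".join(kept), encoding="utf-8")
    return existed
```

The return value tells the maintenance page whether there was actually a file to delete.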
Even though you've probably already implemented a simple routine to check the form fields of a new article, you can do much more. Although you can't reasonably check for a valid user name or e-mail address, you can enhance each post. You can run a simple text parser through the form field values of a post and check for certain strings of characters. For example, you could parse URLs and put HTML code around them to enable readers to directly access a referenced URL.
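Turning URLs in a post into links could be done with a small regular expression, as in this Python sketch. The pattern is deliberately simple and is an assumption about what counts as a URL; it will miss some valid URLs and may accept some invalid ones.

```python
import re
from html import escape

URL_PATTERN = re.compile(r"(?:https?|ftp)://[^\s<>\"]+")


def link_urls(text: str) -> str:
    """Escape text for HTML and turn bare URLs into clickable links."""
    pieces = []
    position = 0
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(".,;:!?)")  # drop trailing punctuation
        end = match.start() + len(url)
        pieces.append(escape(text[position:match.start()]))
        pieces.append(f'<a href="{escape(url)}">{escape(url)}</a>')
        position = end
    pieces.append(escape(text[position:]))
    return "".join(pieces)
```

Because the function escapes everything outside the links as well, it should be applied to the raw form field value, not to text that has already been escaped.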
Similarly, you could modify your parser to look for certain words. Suppose that your Web-based discussion group is about your company's products. You might want to parse the form field values and look for your competitors' names. Although you might not necessarily reject the article, it does give you the ability to filter it out, if necessary. Or suppose there's a discussion group that's geared towards children. You might want to have a text parser to filter out any profanity, so as to not offend anyone. Depending on the level of parsing that you want to implement, the parsing routine could come before, during, or after the post is to be saved in a file.
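A word-watching parser might look like the following Python sketch. The function names and the watch list are assumptions. The first function only reports matches, leaving the decision to you; the second masks them outright.

```python
import re


def find_flagged_words(text: str, watch_list) -> list:
    """Return the watch-list words that appear in text as whole words."""
    found = []
    for word in watch_list:
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(word)
    return found


def mask_words(text: str, watch_list) -> str:
    """Replace each watch-list word with asterisks of the same length."""
    for word in watch_list:
        pattern = r"\b" + re.escape(word) + r"\b"
        text = re.sub(pattern, lambda m: "*" * len(m.group(0)), text,
                      flags=re.IGNORECASE)
    return text
```

Matching on whole words avoids flagging innocent words that merely contain a watched word, though no simple word list will catch deliberate misspellings.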
Any Web-based discussion group is at the mercy of the person whose account it's residing in. If it's a fan-oriented board and only one person is running it, she'll have absolute say over what should, and shouldn't, be on the board. Similarly, corporate-backed discussion groups are likely to be filled mostly with praise for the company. However, you should know about some aspects of maintaining a discussion group and its topics.
Even though you're the person responsible for the discussion group, you should be willing to keep an open mind. For example, if you feel that an actor performed badly in a show, don't delete contrary viewpoints. Be willing to let posts that you personally disagree with get posted, and stay posted, on the board. A healthy debate is good for any conversation, Web-based or not. Try to remember the focus of the discussion group, and don't let your own feelings get in the way. That's not to say that you should let all conflicting viewpoints in; just be willing to accept them. It's no use to set up a discussion group about a TV show if everybody who participates complains about it. Rather, try to strike a balance between pro and con positions.
Corporate discussion groups are different. Although you might be willing to let competitors' products get mentioned, your supervisor might not want you to. She might tell you to filter out any messages that portray the competition in a good light. Customer complaints usually provide the balance to company praise in corporate discussion groups. Chances are pretty good that no matter what company you're with, it will want to hear complaints. Your company can't improve its product if it doesn't know what the customer wants. As a result, even if the complaints are fierce, they probably won't be filtered out.
Another important aspect to remember in being the maintainer of a discussion group is to be polite. Even though you (usually) wield supreme power over your board, you shouldn't flex your muscle. If an article's content was inflammatory and unreasonable, don't respond in kind. Politely reason with the person and try to persuade him to resubmit his opinions in a more suitable manner. Don't be too quick to say, "Well, I run things around here, and I say you're wrong, so you're wrong!"
Your discussion group will become more popular if people know it's okay for them to express themselves openly and freely. Let the users complain about your system, your discussion group, even about you. That's what an open discussion is all about. Even though you are in complete control of a discussion group, you should be careful when exercising your position. Supreme executive power derives from a mandate from the users. It helps to keep the participants happy, even when you might not be.
For people who want to encourage discussion about a particular topic, a Web-based discussion group is a good idea. It gives people interested in the same topic a place to go to talk about whatever they want. However, implementing such a board can be difficult, depending on what functionality you want to offer. A discussion group where people simply write new messages and have no reply capability is easy to implement. A discussion group that more closely resembles other discussion forums, such as Usenet, requires more work. Such a board requires more text parsing and cgi-bin scripting, but the payoffs can make the extra work worthwhile.
Also, creating a discussion forum isn't simply putting Web pages and cgi-bin scripts together. It's also a fair amount of responsibility. You should be patient and understanding and be willing to listen to complaints. Depending on the purpose of your discussion group, you might not be able to, but you should at least try. The main goal of any discussion group is to allow people a chance to talk about whatever the board was created to talk about. If it's about a TV show, let the users talk about the TV show. If it's about a product, let the users talk about the product. Don't, however, get so engaged in a discussion that you let your personal feelings get in the way. If people disagree with you on something, don't condemn or censor them; let them have their say.