A Big-Category-Tree Browsing Method

Dr. Chin-Liang Chang
Nicesoft Corporation

Email: nicesoft2009@gmail.com

July 2005

HOW DO YOU PAY?

We charge $40 per hour. We will give you an estimate of the total hours needed to write the JavaScript code for the interface and to build a database of links for any sub-directory in your big directory tree.

If you would like to have this service, please click HERE.

1. INTRODUCTION

Internet browsing is one kind of searching, in which the keywords are names of categories (directories) or names of topics. A user can click on a "directory" and get a list of "subdirectories." The user then clicks on one of the subdirectories and gets another list of "subdirectories." This process continues until the user gets a list of websites in the category specified by all the subdirectories the user has clicked.

For example, nodes A, B, C, E, F, ... denote directory (subdirectory), topic (subtopic), or category (subcategory) names. The child nodes B, C, D, E, F are the subdirectories of the parent node A. Similarly, the child nodes G and H are the subdirectories of the parent node B, and so on. A leaf node is a node that has no child nodes. Each node in the tree may or may not be associated with information items (e.g., websites, product descriptions). A leaf node must be associated with information items.

If we use indentations for subdirectories, the above tree can also be represented as

A
--B
----G
----H
------S
------T
--C
----I
----J
--D
----K
------U
------V
----L
------W
--E
----M
----N
----O
--F
----P
------X
------Y
----Q
----R
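
The indented form is easy to turn back into a tree data structure. The following is a minimal sketch, not the author's program: it assumes two dashes per level, and the node shape `{ name, children }` and the function names `parseIndentedTree` and `isLeaf` are hypothetical.

```javascript
// Parse an indented listing (two dashes per level, as in the listing above)
// into nested { name, children } nodes. The node shape is an assumption.
function parseIndentedTree(text) {
  const roots = [];
  const stack = []; // stack[d] = most recent node seen at depth d

  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;

    const dashes = line.match(/^-*/)[0].length;
    if (dashes % 2 !== 0) {
      throw new Error("Odd number of dashes: " + line);
    }
    const depth = dashes / 2;
    if (depth > stack.length) {
      throw new Error("Line skips a level: " + line);
    }

    const node = { name: line.slice(dashes).trim(), children: [] };
    if (depth === 0) {
      roots.push(node);
    } else {
      stack[depth - 1].children.push(node);
    }
    stack.length = depth; // forget deeper nodes from earlier branches
    stack.push(node);
  }
  return roots;
}

// A leaf node is a node without child nodes.
function isLeaf(node) {
  return node.children.length === 0;
}

const [nodeA] = parseIndentedTree("A\n--B\n----G\n----H\n------S\n------T\n--C\n----I");
console.log(nodeA.children.map((child) => child.name)); // names of A's subdirectories
```

The stack keeps one ancestor per depth, so each line is attached to its parent in a single pass over the listing.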

Directory or category trees appear on many websites on the Internet. For example, consider the following top-level directories, where the number in parentheses indicates the number of subdirectories:

Arts & Humanities
--Art History (1684)
--Artists (1945)
--Arts Therapy
--Awards (173)
--Booksellers
--By Region (53400)
--Censorship (13)
--Chats & Forums (20)
--Crafts (962)
--Criticism and Theory (32)
--Cultural Policy
--Cultures and Groups (279)
--Design Arts (6520)
--Education (586)
--Events (277)
--Humanities (52751)
--Institutes (32)
--Job and Employment Resources (36)
--Museums, Galleries, and Centers (938)
--News and Media (291)
--Organizations (294)
--Performing Arts (7756)
--Reference (24)
--Shopping and Services
--Visual Arts (18990)
--Web Directories (54)
Business & Economy
Computers & Internet
Education
Entertainment
Government
Health
News & Media
Recreation & Sports
Reference
Regional
Science
Social Science
Society & Culture

Another example is Amazon's category tree for books. At the top of the tree, there are the following categories:

Arts & Photography
--Architecture
--Artists, A-Z
--Design & Decorative Arts
--Drawing
--Fashion
--History & Criticism
--Instructional & How-To
--Museums & Collections
--Other Media
--Painting
--Performing Arts
--Photography
--Reference
--Religious
--Schools, Periods & Styles
--Sculpture
Audiobooks
Biographies & Memoirs
Business & Investing
Children's Books
Comics & Graphic Novels
Computers & Internet
Cooking, Food & Wine
Crafts & Hobbies
Entertainment
Gay & Lesbian
Health, Mind & Body
History
Home & Garden
Literature & Fiction
Mystery & Thrillers
Nonfiction
Outdoors & Nature
Parenting & Families
Politics
Professional & Technical
Puzzles & Games
Reference
Religion & Spirituality
Romance
Science
Science Fiction & Fantasy
Self-Help
Sports
Teens
Textbooks
Travel

Clicking on "Arts & Photography" displays the sub-categories shown above as indented lines under "Arts & Photography". A leaf node may display information items for books, such as "book title", "author", "paperback or hardcover", "new or used", "price", "rating" and "shipping information".

Considering the variety of books Amazon sells, the above category tree is also very large.

The traditional approach to browsing a huge category tree is limited, tedious, and time-consuming. Starting from the top of the tree, the user clicks on a node and a web server returns several sub-nodes. He/she clicks one of the sub-nodes and gets more sub-nodes. Besides sub-nodes, he/she may see and read some information items, hoping to find the ones he/she wants. Otherwise, he/she may continue to go up and down the category tree to try to find the information he/she wants. In a big category tree, it is easy for the user to get lost. Because a big category tree may contain more than 100,000 information items, finding and reading the relevant ones is time-consuming.

Nicesoft Corporation has developed a patent-worthy "Big-Category-Tree Browsing" method, which has been tested on a database of 300,000 websites. Using this method, the user browses a big category tree locally on his/her PC. While browsing and expanding the tree, he/she may click on nodes associated with "information items." Only when the user clicks on one of the information items does he/she need to go to a server's database to fetch the information.
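
The split between local browsing and server access can be illustrated with a small sketch. This is not Nicesoft's implementation. The text does not say exactly which data is kept locally, so the sketch assumes the local copy holds only category names and a `hasItems` flag, and that the server is contacted only when information items are requested. The names `localTree`, `findNode`, `createBrowser`, and `fetchItems` are all hypothetical.

```javascript
// Hypothetical local copy of part of the tree. "hasItems" marks nodes that
// have information items; the items themselves stay on the server.
const localTree = {
  name: "Top",
  hasItems: false,
  children: [
    {
      name: "Arts",
      hasItems: false,
      children: [{ name: "Design", hasItems: true, children: [] }],
    },
  ],
};

// Walk the in-memory tree along a list of category names.
function findNode(root, path) {
  let node = root;
  for (const name of path) {
    node = node.children.find((child) => child.name === name);
    if (node === undefined) return null;
  }
  return node;
}

// fetchItems is injected so the sketch does not assume any particular
// server API; in a browser it might wrap an HTTP request.
function createBrowser(root, fetchItems) {
  return {
    // Expanding a category reads only local data.
    expand(path) {
      const node = findNode(root, path);
      return node === null ? [] : node.children.map((child) => child.name);
    },
    // Only asking for information items reaches the server.
    openItems(path) {
      const node = findNode(root, path);
      if (node === null || !node.hasItems) return [];
      return fetchItems(path.join("/"));
    },
  };
}

let serverCalls = 0;
const browser = createBrowser(localTree, (categoryPath) => {
  serverCalls += 1; // stand-in for a real request
  return ["(items for " + categoryPath + ")"];
});
browser.expand(["Arts"]);              // local only
browser.openItems(["Arts", "Design"]); // the single server call
console.log(serverCalls);
```

Because expansion never calls `fetchItems`, the user can move up and down the tree without a round trip to the server for each click.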

2. PROGRAM DEVELOPMENT

From a category tree, we generate JavaScript or PHP code automatically. If a user wants to browse the category tree locally, JavaScript code is generated. Otherwise, PHP code is generated and placed on a server. We emphasize that the code is generated automatically because the huge size of a big category tree makes it impossible to write by hand.

To test our programs, which implement the "Big-Category-Tree Browsing" method, we needed a real case of a big category tree. In 2005, the Google website still provided directory browsing. We wrote a program to crawl the directory tree and collect the directories, sub-directories, and 300,000 information items (website titles and website addresses).

The crawling program generated files for the big category tree. These files were read and processed by another program to load them into Visual Basic databases and a MySQL database. Finally, a program takes the Visual Basic databases and automatically generates the JavaScript or PHP code, which is included in our Big-Category-Tree Browsing webpages.
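
As an illustration of generating code from a tree rather than writing it by hand, here is a minimal sketch. It is only one possible approach and is not the generator described in this paper: it assumes the tree is already available as a nested object, and the function name `generateTreeScript` and the default variable name `categoryTree` are hypothetical.

```javascript
// Emit JavaScript source text that defines the tree as a variable.
// JSON.stringify output is valid JavaScript, so the generated text can be
// saved to a file and loaded as an ordinary script.
function generateTreeScript(tree, variableName = "categoryTree") {
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(variableName)) {
    throw new Error("Not a simple identifier: " + variableName);
  }
  return "var " + variableName + " = " + JSON.stringify(tree, null, 2) + ";\n";
}

const exampleSource = generateTreeScript({
  name: "Arts",
  children: [{ name: "Design", children: [] }],
});
console.log(exampleSource); // prints the generated script text
```

The same function works unchanged for a tree with hundreds of thousands of nodes, which is the point of generating the code automatically.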

3. USER INTERFACE

The first webpage for our Big-Category-Tree Browsing is shown in Figure 1.


Figure 1

For the category tree we tested, there are 15 top categories (directories): Arts, Business, Computers, Games, Health, Home, Kids and Teens, News, Recreation, Reference, Regional, Science, Shopping, Society, and Sports. The user can click any of these categories.

For example, if he/she clicks "Arts" in Figure 1, he/she will get the screen shown in Figure 2.


Figure 2

Note that "Arts" has changed to red, indicating that it has been clicked (chosen). The second column in Figure 2 shows all the sub-categories of the category "Arts". These sub-categories are shown in black or green. A sub-category in black has sub-categories and can be further expanded. A sub-category in green has no sub-categories and cannot be further expanded.

The button in front of a sub-category indicates that there are "information items" associated with the sub-category. At any time, the user can click any button to see its information items.
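
The colour and button rules can be summarised in a few lines of code. This is a sketch under assumptions: the node shape `{ name, children, items }` and the function name `displayStyle` are hypothetical, and the sketch assumes that the red "clicked" colour takes precedence over black and green.

```javascript
// Decide how one category entry is drawn.
// Assumed node shape: { name, children: [...], items: [...] }.
function displayStyle(node, isClicked) {
  let color;
  if (isClicked) {
    color = "red"; // the category the user has chosen
  } else if (node.children.length > 0) {
    color = "black"; // has sub-categories, can be expanded
  } else {
    color = "green"; // no sub-categories, cannot be expanded
  }
  // A button is shown only when information items are attached.
  return { color: color, showButton: node.items.length > 0 };
}

const design = { name: "Design", children: [{ name: "Industrial", children: [], items: [] }], items: [] };
console.log(displayStyle(design, false));
```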

For example, if he/she clicks the "Graphic Design" button in Figure 2, he/she will see the screen shown in Figure 3.


Figure 3

If you open the "Graphic Design" website, you'll see the webpage shown in Figure 4.


Figure 4

Please go back to Figure 2 to explore more sub-categories. For example, if you click the "Design" sub-category in the second column of Figure 2, you'll get the screen shown in Figure 5.


Figure 5

If you click the "Industrial" sub-category in the third column of Figure 5, you'll get the screen shown in Figure 6.


Figure 6

If you click the button associated with the "Portfolios" sub-category in the fourth column of Figure 6, you'll get the screen shown in Figure 7.


Figure 7

If you open the "Boles, Tyson" website, you can see his portfolios for the sub-category "Arts/Design/Industrial/Portfolios", as shown in Figure 8.


Figure 8
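
The column-by-column walk through Figures 2 to 6 can be sketched as a function that turns the list of clicked categories into the columns to display. This is an illustration only, not the generated code: the node shape `{ name, children }` and the function name `buildColumns` are hypothetical, and the sample tree contains only a few of the categories named above.

```javascript
// Given the top categories and the names clicked so far, return one list of
// names per column: column 1 is the top level, and each further column holds
// the sub-categories of the category clicked in the previous column.
function buildColumns(topCategories, clickedPath) {
  const columns = [topCategories.map((node) => node.name)];
  let level = topCategories;
  for (const name of clickedPath) {
    const node = level.find((candidate) => candidate.name === name);
    if (node === undefined || node.children.length === 0) break;
    columns.push(node.children.map((child) => child.name));
    level = node.children;
  }
  return columns;
}

// Small sample tree using category names from the walkthrough.
const sampleTop = [
  {
    name: "Arts",
    children: [
      {
        name: "Design",
        children: [
          { name: "Industrial", children: [{ name: "Portfolios", children: [] }] },
        ],
      },
      { name: "Graphic Design", children: [] },
    ],
  },
  { name: "Business", children: [] },
];

console.log(buildColumns(sampleTop, ["Arts", "Design", "Industrial"]));
```

With the path Arts, Design, Industrial, the result has four columns, and "Portfolios" appears in the fourth, matching the walkthrough above.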