Winny (Kaneko, 2005), a P2P file sharing program
with over 200,000 users in Japan, uses three keywords
for its background file downloading process. The
keywords are also used for P2P clustering, but users
have to set them in advance. In the Winny network,
files that a node opens for sharing are cached at
other nodes, as in Freenet (Clarke et al., 2002), so
the network load is heavy. Therefore, to realize more
efficient file sharing, Winny constructs a layered
P2P network based on each node’s connection speed,
for example, up to 64 kbps, xDSL, or fiber. P2P URL
sharing distributes much smaller data such as URLs,
so it has to give more weight to whether the reboot
intervals of the nodes are long and whether the nodes
have a fixed global IP address than to the connection
speed.
7 CONCLUSION
Exchanging the URLs that users are viewing in real
time surfaces hot topics, that is, topics attracting
considerable attention, without any explicitly
designated keywords. An effective way to realize this
is to implement the client side subsystem as an add-on
program for web browsers and the server side as a P2P
network. In the current version of our system, the
server side is implemented as server-centric CGI
programs, but a P2P network could take its place while
preserving the interface, which consists of HTTP and
CGI, between the server side and the client side. The
pilot experiment showed the usefulness of our proposal.
In addition, the system for exchanging URLs could
serve as infrastructure for other communication tools.
8 FUTURE WORKS
The top pages of portal sites tend to be ranked
highly. This problem could be resolved through the
user interface of the extension: users could hide such
entries from the rankings, for example by right-clicking
a target entry, and exchange the list of disabled
portal sites via the server side subsystem.
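
As an illustration of this idea, the following minimal sketch ranks URLs by
view count while skipping entries that users have disabled. The function name
`rank_urls`, the example URLs, and the use of a plain view count as the
ranking score are assumptions made for illustration; they do not describe the
actual extension.

```python
from collections import Counter


def rank_urls(viewed_urls, disabled_urls, top_n=10):
    """Rank URLs by view count, hiding every URL listed in disabled_urls."""
    counts = Counter(u for u in viewed_urls if u not in disabled_urls)
    # most_common() returns (url, count) pairs in descending order of count.
    return [url for url, _ in counts.most_common(top_n)]


# Hypothetical view log: the portal top page dominates the raw ranking.
views = [
    "http://portal.example/", "http://portal.example/",
    "http://portal.example/", "http://news.example/story1",
    "http://news.example/story1", "http://blog.example/entry",
]
# Hypothetical set of entries that users disabled and exchanged.
disabled = {"http://portal.example/"}

ranking = rank_urls(views, disabled)
```

With the portal page disabled, the remaining entries move up in the ranking,
while an empty disabled set leaves the portal page at the top.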
The strategy for clustering users is both important
and difficult. Creating clusters is an effective way
to reduce the number of connections, the traffic on
the P2P network, and the computation of similarity
between users. The separation of clusters, however,
may cause users to miss opportunities to be offered
interesting web pages (Linden et al., 2003). In
ordinary recommendation systems such as e-commerce
marketing systems, each user usually purchases fewer
than ten items a day, even if they are heavy users of
the system. In contrast, each user views at least ten
web pages a day, and heavy web users view more than
100. In addition, there are obviously more web users
than e-commerce site users. Reduction by clustering
must therefore be introduced for the effective
exchange of web histories. Parameters for establishing
clusters, such as the number of connections of each
node and the hop limit for web history transfer, could
be found through simulation using the large volume of
proxy logs at our university.
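
The kind of simulation meant here can be sketched as follows. This is a toy
model under stated assumptions, not the planned experiment: the network is a
random graph rather than one derived from proxy logs, and the names
`build_network` and `reachable_within` as well as the parameter values are
hypothetical. It shows how the connection limit per node and the hop limit
together bound how far a web history spreads.

```python
import random
from collections import deque


def build_network(num_nodes, max_connections, seed=0):
    """Randomly link nodes so that no node exceeds max_connections links."""
    rng = random.Random(seed)
    neighbors = {n: set() for n in range(num_nodes)}
    for node in range(num_nodes):
        candidates = [m for m in range(num_nodes) if m != node]
        rng.shuffle(candidates)
        for other in candidates:
            if len(neighbors[node]) >= max_connections:
                break
            if len(neighbors[other]) < max_connections:
                # Links are symmetric: both ends record the connection.
                neighbors[node].add(other)
                neighbors[other].add(node)
    return neighbors


def reachable_within(neighbors, source, hop_limit):
    """Return the nodes that receive a history sent from source (BFS)."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == hop_limit:
            continue  # the history is not forwarded beyond the hop limit
        for other in neighbors[node]:
            if other not in hops:
                hops[other] = hops[node] + 1
                queue.append(other)
    return set(hops) - {source}


network = build_network(num_nodes=200, max_connections=4)
for hop_limit in (1, 2, 3, 4):
    reached = reachable_within(network, source=0, hop_limit=hop_limit)
    print(hop_limit, len(reached))
```

Sweeping `max_connections` and `hop_limit` in such a model gives a rough view
of the trade-off between traffic and coverage before real logs are replayed.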
The identity of URLs is also important. While
CMSs generate a distinct URL for each blog entry,
known as a permalink, visitors can read the entries
not only at their permalinks but also on the summary
pages of recent entries, of the day, of the month, and
so on. Consequently, a given entry can be read at many
URLs. It is desirable to treat these URLs as the same.
This problem could be resolved by introducing
similarity scores between URLs. The URLs that point to
the same entry are similar, because most CMSs add the
date of the entry to its permalink. In addition, the
similarities between URLs would help us to cluster
users: users who often see different entries of the
same blog should be placed in the same cluster.
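
One possible form of such a score is sketched below. It assumes, purely for
illustration, that similarity is the fraction of leading tokens (host name
followed by path segments) that two URLs share; the function names and the
example blog URLs are hypothetical, and other scoring schemes are equally
conceivable.

```python
from urllib.parse import urlsplit


def url_tokens(url):
    """Split a URL into its host followed by its non-empty path segments."""
    parts = urlsplit(url)
    return [parts.netloc.lower()] + [s for s in parts.path.split("/") if s]


def url_similarity(url_a, url_b):
    """Score in [0, 1]: shared leading tokens over the longer token list."""
    a, b = url_tokens(url_a), url_tokens(url_b)
    if a[0] != b[0]:
        return 0.0  # assumption: different hosts are treated as unrelated
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


# Hypothetical URLs at which the same blog entry can be read.
permalink = "http://blog.example/2006/04/11/entry-title"
daily = "http://blog.example/2006/04/11/"
monthly = "http://blog.example/2006/04/"
other_blog = "http://another.example/2006/04/11/entry-title"

scores = {
    "daily": url_similarity(permalink, daily),
    "monthly": url_similarity(permalink, monthly),
    "other": url_similarity(permalink, other_blog),
}
```

Under this scheme the daily summary page scores higher against the permalink
than the monthly page does, and a URL on another host scores zero.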
From the viewpoint of privacy, it is important not
to send URLs that should not be shared. Our extension
does not send any URLs in its default setting and
explicitly shows, by corresponding icons, whether
sending URLs is turned on or off. On the other hand,
web pages without appropriate access controls, such as
password authentication or the Limit directives of web
servers, can be seen by outsiders. Such pages are
effectively public even if their URLs are not
disclosed. This problem is known as “Google hacking”
among security experts. Appropriate access control
settings would make the problem trivial.
Spammers may attack this system by sending URLs
that they want to advertise. This problem is known as
a “shilling attack”. However, since the information
used for clustering is the URLs themselves, the
spammers may be classified into a cluster of their own
and thus be unable to influence ordinary users. The
clustering of spammers will be examined through
simulation.
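
The expected effect can be illustrated with a toy example. The sketch below
assumes Jaccard similarity between users’ URL sets and a simple single-link
grouping with a fixed threshold; these choices, the function names, and the
sample histories are all hypothetical and stand in for whatever clustering
method is eventually adopted. It also assumes that spammers send only the
URLs they advertise; spammers who mix in popular URLs would blur the
separation.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of URLs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_users(histories, threshold=0.3):
    """Single-link grouping of users whose URL sets are similar enough."""
    clusters = []
    for user in histories:
        # Find every existing cluster that has a member similar to this user.
        linked = [
            c for c in clusters
            if any(jaccard(histories[user], histories[m]) >= threshold
                   for m in c)
        ]
        merged = {user}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters


# Hypothetical histories: two ordinary users and two spammers.
histories = {
    "alice": {"http://a.example/", "http://b.example/", "http://c.example/"},
    "bob": {"http://b.example/", "http://c.example/", "http://d.example/"},
    "spam1": {"http://ad.example/1", "http://ad.example/2"},
    "spam2": {"http://ad.example/1", "http://ad.example/2",
              "http://ad.example/3"},
}

clusters = cluster_users(histories)
```

In this example the two spammers end up in a cluster that contains no
ordinary user, so their URLs would not reach the other cluster’s rankings.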
REFERENCES
Alaniz, A., Truong, K. N., and Antonio, J. (2003). Automat-
ically Sharing Web Experiences through a Hyperdoc-
ument Recommender System. In Proceedings of the
14th ACM Conference on Hypertext and Hypermedia,
pages 48–56.
Clarke, I., Miller, S. G., Hong, T. W., Sandberg, O., and
Wiley, B. (2002). Protecting free expression online
with Freenet. IEEE Internet Computing, 6(1):40–49.
WEBIST 2006 - WEB INTERFACES AND APPLICATIONS